by Ravi Shankar, Silicon Graphics
In December 1995, Silicon Graphics® demonstrated the first graphical class browser and static analysis tools to help understand Java™ programs. In March, Silicon Graphics released Cosmo™ Code 1.0. This article discusses how Cosmo Code was developed and brought to market quickly, and how it will evolve and further aid in understanding large-scale Java applets and applications.
Java is an object-oriented programming language initially targeted at providing dynamic and interactive content for the World Wide Web. Its simplicity and increasing acceptance by the industry and the developer community have made it a language of choice for developing Web-based applications. One of its main attractions is the ability to write relatively small pieces of code to achieve impressive results. This is mainly facilitated by the rich set of class libraries that are uniformly supported on all platforms. Typically, a new Java developer looks at existing Java programs and libraries to understand how to write new programs. Cosmo Code provides functionality and features, based on static analysis techniques, that are valuable for understanding Java programs and libraries.
Static analysis is crucial to help programmers understand the structure of their programs. Static analysis differs from dynamic analysis in that it analyzes static source code, not an executing program. Silicon Graphics' ProDev™ WorkShop tools have provided this functionality for the past five years, helping developers port and understand their C, C++, FORTRAN, and Ada programs. With the advent of C++ and its widespread use, we felt that static analysis capabilities should be supported on object-oriented programs in a more rigorous fashion. We focused on two kinds of analysis: global and class-centered. Global analysis is helpful in finding out where a symbol is defined and used. A good example of a global analysis tool is AT&T's cscope. Class-centered analysis is helpful to understand the inner details of a given class and how that class relates to and interacts with other classes. In 1993, we introduced class-centered analysis to Silicon Graphics' ProDev WorkShop C++. We have now extended our language-smart static-analysis capabilities to Java.
The following sections describe Cosmo Code's features for global and class-centered analyses. In addition, an example program and the Queries section list some of the global and class-centered queries supported in Cosmo Code 1.0.
A Cosmo Code user specifies a fileset for performing global analysis and class-centered analysis. A fileset typically contains a list of .class files that belong to an applet. Cosmo Code processes the fileset and presents all the classes found in the Overview card of the Query Deck, as shown in Figure 1. This card lists all classes and the source location where the class is defined. The context-sensitive menu shows the queries that can be performed on the selected class.
The classes listed include the applet's classes as well as the Java system and library classes on which the applet depends. Overview and Query cards provide an interface to perform global queries. These queries allow the user to find the definitions and uses of classes, methods, and variables. The results are organized as a spreadsheet. Each cell in the spreadsheet is individually selectable, and further queries are made on the item in the cell. Since each result has a source context, you can see the source text responsible for the result without loss of focus. By double-clicking the filename, you can view the entire file in the Source card with the source line highlighted. Here is a list of global and context-sensitive queries.
To understand object-oriented programs, you have to understand the key classes in the program and how they interact with one another. Typically, some classes provide data abstraction, whereas others provide services. To understand a class, not only do you need to view it, but you also want to be able to ask questions about it; for example, How does class A override class B? How does class D use class C? If you rely on the sources alone, you may have to see several sources simultaneously and flip back and forth to understand how the classes interact or how inheritance is used. A common solution to this problem has been to provide an inheritance graph and list of class contents for a selected class. While this solution may be necessary, it is not sufficient.
As a developer, user, or technical writer, you may want to understand one or more of the following aspects of classes:
You may also want to look at a set of classes in a sub-hierarchy to understand or to make inheritance-related changes. We think that the best way to gain this understanding is with a concise, comprehensive, customizable view of a class. This view should also facilitate simple navigation to sources and documentation. Cosmo Code provides one such view of a class in the Class card, which is the basis for class-centered analysis. Without using a lot of screen real estate, the Class card provides both a flexible view of the inner details of a class and a complete context in terms of the classes that it relates to through inheritance, implementation, and interaction.
Figure 2 shows a Class card with the Shape class displayed. When you double-click a class name in the Overview or Query card, the Class card displays the class on which you clicked. In the figure, inherited fields from the class java.lang.Object are hidden by preference. You can change the default outline ordering shown in the figure. In addition to the public, protected, and private access categories, you can classify data as restricted, which limits visibility to the package or source file.
The left side of the Class card in Figure 2 lists the class fields, whereas the right side lists related classes. The fields include inherited ones in addition to those that are explicitly defined or overridden by the class. Since every class in Java is derived from java.lang.Object, and class hierarchies tend to be deeper than in C++, you can set a user preference to show only fields that the class defines. All the fields are shown with their associated attributes, such as final, native, abstract, and synchronized.
The icons in the outline display let you collapse or expand the subset of fields or classes. To find out what java.awt.Graphics offers, turn off protected and private and look at methods under public. However, if you want to derive a new class from Circle you may want to see fields under the protected hierarchy. Refer to the sample applet for more information.
The context-sensitive menus of the Class card feature an option that brings up source corresponding to a variable, class, or method. Double-clicking a field also shows its definition in the source card; navigation to the source is inheritance-sensitive. For example, when you view the definition of a method, you see the appropriate definition for the current class.
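The inheritance-sensitive lookup described above can be pictured as a walk up the superclass chain until a class that declares the method is found. The following is a minimal sketch of that idea, not Cosmo Code's actual implementation: ClassInfo and DefinitionResolver are hypothetical names, and the model ignores interfaces and method overloading.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of a class: its name, superclass, and declared methods. */
final class ClassInfo {
    final String name;
    final String superName;            // null only for java.lang.Object
    final Set<String> declaredMethods;

    ClassInfo(String name, String superName, String... declaredMethods) {
        this.name = name;
        this.superName = superName;
        this.declaredMethods = new HashSet<>(Arrays.asList(declaredMethods));
    }
}

/** Finds the definition of a method that is in effect for a given class. */
final class DefinitionResolver {
    private final Map<String, ClassInfo> classes = new HashMap<>();

    void add(ClassInfo info) {
        classes.put(info.name, info);
    }

    /** Returns the nearest class, starting at className, that declares the method, or null. */
    String definingClass(String className, String method) {
        ClassInfo c = classes.get(className);
        while (c != null) {
            if (c.declaredMethods.contains(method)) {
                return c.name;
            }
            // Move one level up the hierarchy; stop after java.lang.Object.
            c = (c.superName == null) ? null : classes.get(c.superName);
        }
        return null;
    }
}
```

With Shape declaring setSize and Disc derived from Circle, which is derived from Shape, asking for setSize with Disc as the current class resolves to Shape, while a method that Circle overrides would resolve to Circle.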
The right side of the Class card displays an outlined list of related classes based on inheritance or interaction. The list shows the relationship that exists even among the related classes. For example, in Figure 2, the derived classes Line, Rect (rectangle), and Circle are at the same level. The class Disc is indented one level from Circle to indicate that it is derived from Circle. The relationships shown depend on whether the focus is a Java class or Java interface. The Class card indicates the kind of class or interface shown as one of Java class, Java abstract class, Java final class, or Java Interface. For a Java class, in addition to the base class, a list of interfaces it implements is shown and for a Java interface, a list of implementations of the interface is shown. The set of classes that directly interact (use a variable, use a method, or create an object) with the current class appear under the USES and USEDBY headings (see Figure 2). Double-clicking any class changes the view to that class, thereby providing a simple way to shift focus between either the user and used classes, or between the derived and base classes.
Documentation is essential in the understanding of library classes. The Class card allows you to display the appropriate HTML page and location in a Web browser. Menu options are available in the Class card to access either class or field HTML pages.
You can select every item displayed in the Class card to make an appropriate class-centered query, as shown in Figure 3. Class-centered queries differ from global queries in that they understand the scope of the search and they take the inheritance model into account. The results of class-centered queries are highlighted in the Class card. Class-centered queries allow users to focus on a class and perform analysis without changing the display every time a query is answered.
Cosmo Code supports three categories of class-centered queries:
The Queries section lists some of the class-centered queries.
Making changes relying on the global cross-reference results can be tedious. For instance, assume that you want to change the type or name of the variable width in the class Shape. The variable is defined in three classes in the Java AWT package, but you are interested only in the definition and use of Shape's width. Using the Shape class as the focus in the Class card, you can ask a single query to find out which classes and methods use the width field in the Shape class, as shown in Figure 3.
To make the query, you select the variable width and choose the What Accesses option. The result of the query is highlighted on both sides of the Class card to show classes that use width as well as Shape's own method setSize. A brief description of the query is shown along with the number of results.
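The difference between a global cross-reference and the What Accesses query can be illustrated with a small filter over cross-reference records. This is only a sketch under stated assumptions: FieldUse and AccessQuery are hypothetical names, and each record is assumed to already carry the class that declares the field (in a real tool that would come out of inheritance-aware resolution).

```java
import java.util.ArrayList;
import java.util.List;

/** One cross-reference record: usingClass.usingMethod touches ownerClass.fieldName. */
final class FieldUse {
    final String usingClass;
    final String usingMethod;
    final String ownerClass;   // class that declares the field (assumed already resolved)
    final String fieldName;

    FieldUse(String usingClass, String usingMethod, String ownerClass, String fieldName) {
        this.usingClass = usingClass;
        this.usingMethod = usingMethod;
        this.ownerClass = ownerClass;
        this.fieldName = fieldName;
    }
}

final class AccessQuery {
    /** Global query: every use of any field with this name, whichever class declares it. */
    static List<FieldUse> usesByName(List<FieldUse> xref, String fieldName) {
        List<FieldUse> hits = new ArrayList<>();
        for (FieldUse use : xref) {
            if (use.fieldName.equals(fieldName)) {
                hits.add(use);
            }
        }
        return hits;
    }

    /** Class-centered query: only uses of the field as declared by ownerClass. */
    static List<FieldUse> whatAccesses(List<FieldUse> xref, String ownerClass, String fieldName) {
        List<FieldUse> hits = new ArrayList<>();
        for (FieldUse use : usesByName(xref, fieldName)) {
            if (use.ownerClass.equals(ownerClass)) {
                hits.add(use);
            }
        }
        return hits;
    }
}
```

Searching by name alone would also return uses of an unrelated width field, such as the one in java.awt.Rectangle, whereas whatAccesses(xref, "Shape", "width") keeps only the records that refer to Shape's field.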
Understanding object-oriented programs requires more than understanding class hierarchies. You need to understand which classes or methods create objects. Some architecturally key classes act as subsystems. These classes set up the objects and let them interact in response to external events. ShapeApplet, in our example, is one such class. Figure 4 shows objects that are created by the class ShapeApplet. Once you discover which classes are instantiated, you can pick a class and ask which methods in the ShapeApplet created it. You can also double-click on the method to see the source from which the object was created.
Figure 4 shows classes whose objects are instantiated by class ShapeApplet. You make a query by selecting <-This (current class) and choosing What is Instantiated. In this case, ShapeApplet instantiates seven classes.
Graphs of a set of classes based on inheritance or interaction provide a useful overall picture. Cosmo Code provides a class graph that shows all classes and their inheritance and interaction relationships. In the interaction mode, the relationship is based on usage of each other's fields among the classes. Sometimes these class graphs contain too many classes and arcs and become cumbersome. This is especially true for Java when you want to focus on your classes (since a lot of system classes are involved). Cosmo Code provides a way to limit the set of classes viewed to the classes that are directly or indirectly related to the class displayed in the Class card. A butterfly mode is also available to see immediate parents and children.
Figure 5 shows a class graph in which class Shape is the focus, and for which the option to view only the related classes is chosen. In this case, the relationship chosen is Inheritance.
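One way to picture the two filtered views is as graph traversals over inheritance edges. The sketch below is illustrative only: InheritanceGraph is a hypothetical name, and it assumes that "related" in inheritance mode means all ancestors and descendants of the focus class, and that the butterfly mode means only the immediate parents and children.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class InheritanceGraph {
    private final Map<String, List<String>> parents = new HashMap<>();
    private final Map<String, List<String>> children = new HashMap<>();

    /** Records that 'derived' extends or implements 'base'. */
    void addEdge(String base, String derived) {
        children.computeIfAbsent(base, k -> new ArrayList<>()).add(derived);
        parents.computeIfAbsent(derived, k -> new ArrayList<>()).add(base);
    }

    /** Butterfly view: immediate parents and children of the focus class. */
    Set<String> butterfly(String focus) {
        Set<String> result = new TreeSet<>();
        result.addAll(parents.getOrDefault(focus, Collections.emptyList()));
        result.addAll(children.getOrDefault(focus, Collections.emptyList()));
        return result;
    }

    /** Related view: the focus class plus all of its ancestors and descendants. */
    Set<String> related(String focus) {
        Set<String> result = new TreeSet<>();
        result.add(focus);
        collect(focus, parents, result);    // walk upward
        collect(focus, children, result);   // walk downward
        return result;
    }

    private static void collect(String start, Map<String, List<String>> edges, Set<String> out) {
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String current = work.pop();
            for (String next : edges.getOrDefault(current, Collections.emptyList())) {
                if (out.add(next)) {        // add() is false if already visited
                    work.push(next);
                }
            }
        }
    }
}
```

With Shape as the focus, system classes that merely share java.lang.Object as an ancestor are left out of the related view, which is what keeps the graph readable.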
Cosmo Code also supports a call graph for methods. Method call graphs present an overview of how methods make use of other methods in the same class as well as in other classes. This overview is recursive, ending in leaf methods that do not call any other method. Figure 6 shows the call graph for the method handleEvent in the class ShapeApplet.
You can display call graphs for methods by selecting a method in the Class view. Double-clicking on any node in the call graph makes the Source card show the corresponding method definition, as long as the source is available.
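The recursive overview that ends in leaf methods amounts to a depth-first traversal of caller-to-callee edges. The following is a minimal sketch, assuming the edges have already been extracted; CallGraph is a hypothetical name, and methods are identified by plain strings such as "ShapeApplet.handleEvent".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CallGraph {
    private final Map<String, List<String>> callees = new HashMap<>();

    void addCall(String caller, String callee) {
        callees.computeIfAbsent(caller, k -> new ArrayList<>()).add(callee);
    }

    /** Methods reachable from root in depth-first order; each is visited once, so cycles terminate. */
    Set<String> reachable(String root) {
        Set<String> seen = new LinkedHashSet<>();
        visit(root, seen);
        return seen;
    }

    /** Reachable methods that call no other method: the leaves of the call graph. */
    Set<String> leaves(String root) {
        Set<String> result = new LinkedHashSet<>();
        for (String method : reachable(root)) {
            List<String> called = callees.getOrDefault(method, Collections.emptyList());
            if (called.isEmpty()) {
                result.add(method);
            }
        }
        return result;
    }

    private void visit(String method, Set<String> seen) {
        if (!seen.add(method)) {
            return;                          // already expanded
        }
        for (String callee : callees.getOrDefault(method, Collections.emptyList())) {
            visit(callee, seen);
        }
    }
}
```

The visited set matters because methods can call each other recursively; without it, the expansion of a graph such as the one for handleEvent would never reach its leaves.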
Two implementation problems exist: static analysis data collection and fast data access for GUI and interactive query processing. Interactive performance was one of the fundamental usability criteria for Cosmo Code static-analysis features.
Silicon Graphics' ProDev WorkShop collects static analysis data by processing program source files with Silicon Graphics compilers. The compilers collect information even from syntactically incorrect programs. The compilers interact with a database utility to manage the data collected from a set of source files. The ProDev WorkShop static analyzer tools use this data to support queries and program structure visualization.
To support another programming language, we would need a compiler that produces static analysis data and interacts with the database utility, which is what we initially intended for Java. However, we were not satisfied with the performance of the database utility, especially while gathering data. Moreover, reusing static analysis data about libraries that programs use proved challenging. We were looking for a faster mechanism that could reuse information about the libraries. By looking at the Java programs, we found that although users typically write some classes, they use even more from the Java libraries. To enable quick analysis of many applets, we had to rely on a scheme that reused the information about the Java library classes.
C++ programs often include header files of other library classes, so the compiler has direct access to the information about these classes. However, source files are not included in Java programs. We also could not rely on sources for the Java library classes to be available in every installation. The .class files (generated by the Java compiler) that are installed as part of Java libraries contain information about the classes. Also, the format of .class files accommodates the addition of new attributes.
We decided to process .class files to extract much of the static analysis information. It was easy to extract the class definition information. However, we needed significant cross-reference (usage) information to support the kind of analysis that Silicon Graphics' ProDev WorkShop tools provide. The .class files contain the byte codes for the methods, which we processed to gather basic variable and method cross-reference information. We extended this process to gather cross-reference information at the class level. We also changed the Java compiler to emit line number information required to support navigation to sources.
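To give a flavor of what processing a .class file involves, the sketch below reads only the fixed header that every class file starts with: the magic number 0xCAFEBABE, the minor and major version, and the constant pool count. It is an illustration rather than Cosmo Code's extractor; ClassFileHeader is a hypothetical name, and a real tool would go on to decode the constant pool, fields, methods, and bytecodes that follow.

```java
import java.nio.ByteBuffer;

/** Fixed-size header at the start of a Java class file. */
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    /** Parses the first ten bytes of a class file held in memory. */
    static ClassFileHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Java class file");
        }
        int minor = in.getShort() & 0xFFFF;    // unsigned 16-bit values
        int major = in.getShort() & 0xFFFF;
        int poolCount = in.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major, poolCount);
    }
}
```

The constant pool that follows this header is where class, field, and method references are stored, which is what makes cross-reference extraction from compiled classes feasible.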
The current data-collection scheme is a two-phase process. In the first phase, each .class file is processed to produce a corresponding static analysis data file, called a .jsud (Java symbol usage and definition) file. The .jsud files contain class definition, cross-reference, and source context information (if sources are available). In the second phase, called the index-generation phase, an index file (.jsudindex file) is generated from the .jsud files. The .jsudindex file contains the list of classes and overall information about their inheritance and interaction. In the indexing phase, the .jsud files for the dependent classes and libraries are generated, or reused if a previous indexing phase already generated them.
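The generate-or-reuse decision can be sketched as a simple staleness check. The article does not say how Cosmo Code decides that an existing .jsud file can be reused, so the timestamp comparison below is purely an assumption for illustration, and IndexingPlan is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class IndexingPlan {
    /**
     * Returns the classes whose .jsud file must be generated again.
     * Both maps go from class name to a last-modified time; a class that is
     * missing from jsudTimes has no .jsud file yet.
     */
    static List<String> toGenerate(Map<String, Long> classFileTimes, Map<String, Long> jsudTimes) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : classFileTimes.entrySet()) {
            Long jsudTime = jsudTimes.get(entry.getKey());
            // Regenerate when there is no .jsud file or the .class file is newer (assumed rule).
            if (jsudTime == null || jsudTime < entry.getValue()) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

Under this rule, library classes that rarely change are processed once and then reused by every later fileset, so only the applet's own recently compiled classes need new data files.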
It is essential that queries are answered quickly. The static analysis data files organize data and provide hash tables to give the data-access and query-processing modules in Cosmo Code quick access to the data. To process class-centered queries, the query-processing module has an integrated class computation engine. The data-access module first reads the index file and then, on demand and as guided by the index file, reads the .jsud files needed to process queries. For example, only the .jsud files of a class and its base classes are read to show a class in the Class card. The rest of the information is in the index file. However, when a query is made (such as which fields are used by another class), the index file is consulted to read other required .jsud files. In addition, the data in .jsud files is cached in memory. The interaction of these mechanisms is what enables Cosmo Code to respond interactively.
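The demand-driven reading and caching described above can be sketched as follows. This is an illustration under stated assumptions, not the actual data-access module: JsudCache is a hypothetical name, the index is reduced to a map from each class to its superclass, and the contents of a .jsud file are stood in for by a string produced by a caller-supplied loader.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class JsudCache {
    private final Map<String, String> superclassIndex;  // from the index file: class -> superclass
    private final Function<String, String> loader;      // reads one .jsud file (stand-in)
    private final Map<String, String> cache = new HashMap<>();
    int loads = 0;                                       // number of .jsud files actually read

    JsudCache(Map<String, String> superclassIndex, Function<String, String> loader) {
        this.superclassIndex = superclassIndex;
        this.loader = loader;
    }

    /** Returns the data for one class, reading its .jsud file only on first use. */
    String get(String className) {
        String data = cache.get(className);
        if (data == null) {
            data = loader.apply(className);
            loads++;
            cache.put(className, data);
        }
        return data;
    }

    /** Data needed to display a class: its own .jsud data plus that of its base classes. */
    List<String> forClassCard(String className) {
        List<String> result = new ArrayList<>();
        for (String c = className; c != null; c = superclassIndex.get(c)) {
            result.add(get(c));
        }
        return result;
    }
}
```

Displaying Disc reads the data for Disc, Circle, Shape, and java.lang.Object; switching the focus to Circle afterwards reads nothing new, because all of its data is already in the cache.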
You can perform global queries from both the Overview and Query cards. Global queries include
You can perform context-sensitive queries on a cell in the results displayed in the Overview and Query cards. Context-sensitive queries include
Class-centered analysis is performed in the Class card. Its queries are made on the methods, variables, and classes shown in the Class card. Queries include:
Ravi Shankar (ravis@sgi.com) is a member of the Developer Magic group at Silicon Graphics, working on the Cosmo Code programming environment for Java. Previously, he worked on the design and implementation of the C++ class browser in Silicon Graphics' ProDev WorkShop suite of programming environment tools. He also worked on the design and implementation of Fix and Continue (which lets users modify a running program and continue execution in the debugger) in the ProDev WorkShop debugger.
We welcome feedback and comments at devprogram@sgi.com.